Analysis of Flight Delay Data
Nov 20, 2024
Introduce Bayesian Linear Regression (BLR): Understand its principles and how it differs from traditional methods.
Explain Bayesian Concepts: Highlight Bayes’ Theorem, prior knowledge, and posterior distributions.
Discuss Practical Applications: Show how BLR is applied in analyzing real-world data, like airline delays.
Explore Advantages of Bayesian Methods: Quantifying uncertainty, improving predictions, and handling complex data.
Present Analysis Findings: Summarize key insights from our BLR model on time and weather-related airline delays.
BLR: A statistical approach combining prior knowledge and new data.
Goal: Model relationships, make predictions, and handle uncertainty in estimates.
Difference from Traditional Methods: Probability-based estimates instead of fixed values.
Advantages of Bayesian Linear Regression[1]
Incorporation of Prior Knowledge
Uncertainty Quantification
Expanded Hypotheses
Automatic Meta-Analyses
Improved Handling of Small Samples
Complex Model Estimation
Model Specification: Define the linear relationship between the dependent and independent variables.
Choose Priors: Select prior distributions for the model parameters, reflecting any existing knowledge about their values.
Data Collection: Gather relevant data for the variables in the model.
Model Fitting: Use computational methods, such as Markov Chain Monte Carlo (MCMC), to estimate the posterior distributions of the parameters based on the observed data.
Result Interpretation: Analyze the posterior distributions to understand the relationships between variables, including estimating means and credible intervals.
Y = \beta_0 + \beta_1X + \varepsilon
p(B|A) = \frac{p(A|B)\cdot p(B)}{p(A)}
Bayesian inference can be summarized as [4]:
Posterior = \frac{Likelihood \times Prior}{Normalization}
\begin{align*} p(\theta|y) =& \frac{ L(\theta|y)p(\theta) }{p(y)}\\ \\ p(\theta|y) \propto & \text{ }L(\theta|y)p(\theta) \end{align*}
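The proportionality above can be checked numerically with a grid approximation. The sketch below is illustrative only: the five delay values, the N(0, 10^2) prior, and the known noise standard deviation of 5 are hypothetical and not taken from the flight data.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def grid_posterior(data, grid, prior_mean, prior_sd, noise_sd):
    """Return normalized posterior weights for each candidate theta in grid."""
    unnormalized = []
    for theta in grid:
        # Likelihood L(theta | y): product of Gaussian densities of the data
        likelihood = math.prod(normal_pdf(y, theta, noise_sd) for y in data)
        # Posterior is proportional to likelihood times prior
        unnormalized.append(likelihood * normal_pdf(theta, prior_mean, prior_sd))
    normalization = sum(unnormalized)  # plays the role of p(y) on the grid
    return [w / normalization for w in unnormalized]

# Hypothetical delays in minutes (not from the dataset)
delays = [3.0, 7.0, 5.0, 9.0, 6.0]
grid = [i / 10 for i in range(-100, 201)]  # candidate means from -10 to 20
posterior = grid_posterior(delays, grid, prior_mean=0.0, prior_sd=10.0, noise_sd=5.0)
posterior_mean = sum(theta * w for theta, w in zip(grid, posterior))
print(round(posterior_mean, 2))
```

For this conjugate setup the closed-form posterior mean is 1.2 / 0.21, about 5.71, so the grid result should land very close to that value, pulled slightly from the sample mean of 6 toward the prior mean of 0.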
Figure from [3]
What is MCMC?
How Does It Work?
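As a rough illustration of the idea, the sketch below implements a random-walk Metropolis sampler for a simple linear regression. It is a minimal sketch, not the sampler used for the results in this presentation: Stan-based tools use a more efficient Hamiltonian Monte Carlo algorithm. The simulated data, the priors, and the step size are all assumptions chosen for the example.

```python
import math
import random

def log_posterior(b0, b1, sigma, xs, ys):
    """Unnormalized log posterior: log prior plus Gaussian log likelihood."""
    if sigma <= 0:
        return -math.inf
    # Illustrative priors: b0, b1 ~ N(0, 10^2); sigma ~ Exp(1)
    log_prior = -0.5 * (b0 / 10) ** 2 - 0.5 * (b1 / 10) ** 2 - sigma
    log_like = sum(
        -math.log(sigma) - 0.5 * ((y - (b0 + b1 * x)) / sigma) ** 2
        for x, y in zip(xs, ys)
    )
    return log_prior + log_like

def metropolis(xs, ys, n_iter=20000, step=0.15, seed=1):
    """Random-walk Metropolis; returns draws after discarding the first half."""
    rng = random.Random(seed)
    current = (0.0, 0.0, 1.0)  # starting values for (b0, b1, sigma)
    current_lp = log_posterior(*current, xs, ys)
    draws = []
    for _ in range(n_iter):
        # Propose a small Gaussian jump from the current position
        proposal = tuple(v + rng.gauss(0, step) for v in current)
        proposal_lp = log_posterior(*proposal, xs, ys)
        # Accept with probability min(1, posterior ratio)
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        draws.append(current)
    return draws[n_iter // 2:]  # drop warmup draws

# Simulated data with true intercept 1, slope 2, noise sd 0.5 (hypothetical)
data_rng = random.Random(0)
xs = [(i - 10) / 5 for i in range(21)]
ys = [1.0 + 2.0 * x + data_rng.gauss(0, 0.5) for x in xs]

draws = metropolis(xs, ys)
intercept_mean = sum(d[0] for d in draws) / len(draws)
slope_mean = sum(d[1] for d in draws) / len(draws)
```

The retained draws approximate the joint posterior of the three parameters, so their means, standard deviations, and quantiles can be read off directly.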
\begin{align*} Y_i|\beta_0, \beta_1, \sigma &\overset{\text{ind}}{\sim} N (\mu_i, \sigma^2) && \text{with } && \mu_i = \beta_0 + \beta_1X_i \end{align*} Where:
The model can be written as
\begin{align*} Y_i|\beta_0, \beta_1, \sigma &\overset{\text{ind}}{\sim} N (\mu_i, \sigma^2) && \text{with } && \mu_i = \beta_0 + \beta_1X_i \\ \beta_{0} &\sim N(m_0, s_0^2)\\ \beta_1 &\sim N(m_1, s_1^2)\\ \sigma &\sim \text{Exp}(l) \end{align*}
Regression parameters
\begin{align*} Y_i|\beta_0, \beta_1, \dots, \beta_6, \sigma &\overset{\text{ind}}{\sim} N (\mu_i, \sigma^2) && \text{with } && \mu_i = \beta_0 + \beta_1X_{i1} + \beta_2X_{i2} + \dots + \beta_6X_{i6} \\ \beta_{0} &\sim N(m_0, s_0^2)\\ \beta_j &\sim N(m_j, s_j^2), \quad j = 1, \dots, 6\\ \sigma &\sim \text{Exp}(l) \end{align*}
Where
\beta_{0c} \sim N(2, 36^2)
\beta_{1} \sim N(0.02, 0.01^2)
\sigma \sim \text{Exp}(0.02)
Model 1
\begin{align*} Y_i|\beta_0, \beta_1, \sigma &\overset{\text{ind}}{\sim} N (\mu_i, \sigma^2) && \text{with } && \mu_i = \beta_0 + \beta_1X_i \\ \beta_{0} &\sim N(2, 36^2)\\ \beta_1 &\sim N(0.02, 0.01^2)\\ \sigma &\sim \text{Exp}(0.02) \end{align*}
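One way to sanity-check these priors is a prior predictive simulation: draw parameters from the priors, then draw a delay from the likelihood. The sketch below is an illustration under stated assumptions, not part of the original analysis. It treats the intercept as the centered intercept \beta_{0c}, so that at the typical departure time the centered predictor is zero and \mu = \beta_{0c}. The function names and the seed are hypothetical.

```python
import random
import statistics

def draw_prior(rng):
    """Draw one (beta0, beta1, sigma) triple from the Model 1 priors."""
    beta0 = rng.gauss(2, 36)          # N(2, 36^2)
    beta1 = rng.gauss(0.02, 0.01)     # N(0.02, 0.01^2)
    sigma = rng.expovariate(0.02)     # Exp(rate = 0.02), mean 1 / 0.02 = 50
    return beta0, beta1, sigma

def prior_predictive_at_typical_time(n_draws=20000, seed=42):
    """Simulate arrival delays at the typical departure time (centered X = 0)."""
    rng = random.Random(seed)
    sigmas, delays = [], []
    for _ in range(n_draws):
        beta0, _beta1, sigma = draw_prior(rng)
        sigmas.append(sigma)
        delays.append(rng.gauss(beta0, sigma))  # mu equals beta0 when X is centered at 0
    return sigmas, delays

sigmas, delays = prior_predictive_at_typical_time()
mean_sigma = statistics.fmean(sigmas)
median_delay = statistics.median(delays)
```

If the priors are sensible, the simulated delays should cover the range of arrival delays seen in practice without being concentrated on implausible values.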
Model 2
\begin{align*} Y_i|\beta_0, \beta_1, \dots, \beta_6, \sigma &\overset{\text{ind}}{\sim} N (\mu_i, \sigma^2) && \text{with } && \mu_i = \beta_0 + \beta_1X_{i1} + \beta_2X_{i2} + \dots + \beta_6X_{i6} \\ \beta_{0} &\sim N(2, 46^2)\\ \beta_j &\sim N(0, 50^2), \quad j = 1, \dots, 6\\ \sigma &\sim \text{Exp}(0.02) \end{align*}
Prior Selection
Intercept (\beta_0): \beta_0 \sim N(0, 5^2) Assumes no strong baseline effect.
Slope (\beta_1): \beta_1 \sim N(0, 5^2) Reflects no strong prior belief about the relationship between weather incidents and delays.
Error Term (\sigma): \sigma \sim \text{Exp}(1) Accounts for variability in delays; allows flexibility.
Model Specification
Y_i \mid \beta_0, \beta_1, \sigma \sim N(\mu_i, \sigma^2), \quad \mu_i = \beta_0 + \beta_1 X_i
| Header | Description |
|---|---|
| Fl Date | Flight Date (yyyy-mm-dd) |
| Airline | Airline Name |
| Airline DOT | Airline Name and Unique Carrier Code. When the same code has been used by multiple carriers, a numeric suffix is used for earlier users, for example, PA, PA(1), PA(2). Use this field for analysis across a range of years. |
| Airline Code | Unique Carrier Code |
| DOT Code | An identification number assigned by US DOT to identify a unique airline (carrier). A unique airline (carrier) is defined as one holding and reporting under the same DOT certificate regardless of its Code, Name, or holding company/corporation. |
| Fl Number | Flight Number |
| Origin | Origin Airport, Airport ID. An identification number assigned by US DOT to identify a unique airport. Use this field for airport analysis across a range of years because an airport can change its airport code and airport codes can be reused. |
| Origin City | Origin City Name, State Code |
| Dest | Destination Airport, Airport ID. An identification number assigned by US DOT to identify a unique airport. Use this field for airport analysis across a range of years because an airport can change its airport code and airport codes can be reused. |
| Dest City | Destination City Name, State Code |
| CRS Dep Time | CRS Departure Time (local time: hhmm) |
| Dep Time | Actual Departure Time (local time: hhmm) |
| Dep Delay | Difference in minutes between scheduled and actual departure time. Early departures show negative numbers. |
| Taxi Out | Taxi Out Time, in Minutes |
| Wheels Off | Wheels Off Time (local time: hhmm) |
| Wheels On | Wheels On Time (local time: hhmm) |
| Taxi In | Taxi In Time, in Minutes |
| CRS Arr Time | CRS Arrival Time (local time: hhmm) |
| Arr Time | Actual Arrival Time (local time: hhmm) |
| Arr Delay | Difference in minutes between scheduled and actual arrival time. Early arrivals show negative numbers. |
| Cancelled | Cancelled Flight Indicator (1=Yes) |
| Cancellation Code | Specifies The Reason For Cancellation |
| Diverted | Diverted Flight Indicator (1=Yes) |
| CRS Elapsed Time | CRS Elapsed Time of Flight, in Minutes |
| Actual Elapsed Time | Elapsed Time of Flight, in Minutes |
| Air Time | Flight Time, in Minutes |
| Distance | Distance between airports (miles) |
| Carrier Delay | Carrier Delay, in Minutes |
| Weather Delay | Weather Delay, in Minutes |
| NAS Delay | National Air System Delay, in Minutes |
| Security Delay | Security Delay, in Minutes |
| Late Aircraft Delay | Late Aircraft Delay, in Minutes |
| Table 1: Flight Delay Summary by Flight Period | | | | |
|---|---|---|---|---|
| Flight Period | Morning | Afternoon | Evening | Total |
| TotalFlightsCount | 1246031 (41.5%) | 1423140 (47.4%) | 330829 (11.0%) | 3000000 (100%) |
| CancelledFlightsCount | 30690 (38.8%) | 38343 (48.4%) | 10107 (12.8%) | 79140 (100%) |
| DivertedFlightsCount | 2555 (36.2%) | 3901 (55.3%) | 600 (8.5%) | 7056 (100%) |
| AvgCRSDepTime | 08:49:31 | 15:73:19 | 20:66:23 | 13:27:04 |
| AvgDepTime | 08:53:58 | 15:89:05 | 20:12:40 | 13:29:47 |
| AvgDepDelay | 5.23 | 12.93 | 16.51 | 10.12 |
| AvgTaxiOut | 16.87 | 16.44 | 16.65 | 16.64 |
| AvgTaxiIn | 7.75 | 7.78 | 6.95 | 7.68 |
| AvgCRSArrTime | 10:87:15 | 17:85:11 | 17:42:14 | 14:90:34 |
| AvgArrTime | 10:86:01 | 17:71:56 | 15:89:47 | 14:66:31 |
| AvgArrDelay | -0.77 | 7.34 | 10.04 | 4.26 |
| AvgAirTime | 114.12 | 109.8 | 116.31 | 112.31 |
| CarrierDelayCount | 86824 (29.2%) | 162266 (54.6%) | 47861 (16.1%) | 296951 (100%) |
| SecurityDelayCount | 887 (32.1%) | 1434 (52.0%) | 438 (15.9%) | 2759 (100%) |
| WeatherDelayCount | 8380 (26.7%) | 18758 (59.7%) | 4290 (13.7%) | 31428 (100%) |
| NASDelayCount | 80604 (31.4%) | 144366 (56.3%) | 31507 (12.3%) | 256477 (100%) |
| LateAircraftDelayCount | 42721 (16.5%) | 168902 (65.2%) | 47391 (18.3%) | 259014 (100%) |
| Summary includes morning, afternoon, and evening flight periods. | ||||
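Several averaged clock times in Table 1 have minute fields above 59 (for example 15:73:19), which is the pattern produced when hhmm values are averaged as plain integers. That reading is an inference from the printed values, not something the source states. A minimal sketch of averaging on a minutes-since-midnight scale follows; the function names and sample times are hypothetical, and times that wrap past midnight are not handled.

```python
def hhmm_to_minutes(hhmm):
    """Convert an hhmm integer such as 1559 to minutes since midnight."""
    hours, minutes = divmod(hhmm, 100)
    return hours * 60 + minutes

def mean_clock_time(hhmm_values):
    """Average hhmm times on the minutes scale and format as hh:mm:ss."""
    mean_minutes = sum(hhmm_to_minutes(v) for v in hhmm_values) / len(hhmm_values)
    total_seconds = round(mean_minutes * 60)
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"

# Hypothetical departure times: 15:30, 15:59, 16:01
times = [1530, 1559, 1601]
naive_mean = sum(times) / len(times)   # about 1563.33, which reads as "15:63"
proper_mean = mean_clock_time(times)   # "15:50:00"
```

Averaging the raw integers treats 1559 and 1601 as 42 units apart instead of 2 minutes apart, which is what pushes the minute field past 59.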
\begin{align*} Y_i|\beta_0, \beta_1, \sigma &\overset{\text{ind}}{\sim} N (\mu_i, \sigma^2) && \text{with } && \mu_i = \beta_0 + \beta_1X_i \\ \beta_{0} &\sim N(2, 36^2)\\ \beta_1 &\sim N(0.02, 0.01^2)\\ \sigma &\sim \text{Exp}(0.02) \end{align*}
\begin{align*} Y_i|\beta_0, \beta_1, \dots, \beta_6, \sigma &\overset{\text{ind}}{\sim} N (\mu_i, \sigma^2) && \text{with } && \mu_i = \beta_0 + \beta_1X_{i1} + \beta_2X_{i2} + \dots + \beta_6X_{i6} \\ \beta_{0} &\sim N(2, 46^2)\\ \beta_j &\sim N(0, 50^2), \quad j = 1, \dots, 6\\ \sigma &\sim \text{Exp}(0.02) \end{align*}
| Table 2. Estimations of the Posterior Distributions’ Regression Coefficients. | | | | |
|---|---|---|---|---|
| | | Mean | SE | 95% CI |
| Model 1: Continuous Predictor | | | | |
| Flat Priors | 𝛽₀ Intercept | -10.92 | 0.47 | (-11.85; -10.00) |
| | 𝛽₁ Departure Time | 0.02 | 0.00 | (0.02; 0.02) |
| | 𝜎 | 51.09 | 0.12 | (50.86; 51.33) |
| Default Tuned Priors | 𝛽₀ Intercept | -10.94 | 0.47 | (-11.86; -10.01) |
| | 𝛽₁ Departure Time | 0.02 | 0.00 | (0.02; 0.02) |
| | 𝜎 | 51.09 | 0.12 | (50.86; 51.32) |
| Tuned Priors | 𝛽₀ Intercept | -11.66 | 0.17 | (-12.02; -11.32) |
| | 𝛽₁ Departure Time | 0.02 | 0.00 | (0.02; 0.02) |
| | 𝜎 | 51.10 | 0.12 | (50.87; 51.33) |
| Model 2: Categorical Predictor | | | | |
| Flat Priors | 𝛽₀ Intercept (Tuesday) | 1.57 | 0.44 | (0.69; 2.43) |
| | 𝛽₅ Sunday | 4.36 | 0.61 | (3.21; 5.56) |
| | 𝛽₆ Monday | 2.93 | 0.65 | (1.72; 4.21) |
| | 𝛽₁ Wednesday | 4.92 | 0.61 | (3.72; 6.13) |
| | 𝛽₂ Thursday | 2.66 | 0.65 | (1.47; 3.88) |
| | 𝛽₃ Friday | 1.86 | 0.62 | (0.69; 3.05) |
| | 𝛽₄ Saturday | 3.43 | 0.61 | (2.25; 4.66) |
| | 𝜎 | 51.39 | 0.12 | (51.16; 51.62) |
| Default Tuned Priors | 𝛽₀ Intercept (Tuesday) | 1.54 | 0.40 | (0.73; 2.43) |
| | 𝛽₅ Sunday | 3.48 | 0.58 | (2.31; 4.70) |
| | 𝛽₆ Monday | 1.89 | 0.60 | (0.74; 3.04) |
| | 𝛽₁ Wednesday | 4.40 | 0.60 | (3.21; 5.58) |
| | 𝛽₂ Thursday | 2.69 | 0.60 | (1.48; 3.82) |
| | 𝛽₃ Friday | 2.97 | 0.64 | (1.75; 4.18) |
| | 𝛽₄ Saturday | 4.96 | 0.59 | (3.77; 6.13) |
| | 𝜎 | 51.40 | 0.12 | (51.17; 51.62) |
| Tuned Priors | 𝛽₀ Intercept (Tuesday) | 1.54 | 0.44 | (0.66; 2.43) |
| | 𝛽₅ Sunday | 4.41 | 0.61 | (3.23; 5.66) |
| | 𝛽₆ Monday | 2.99 | 0.64 | (1.73; 4.25) |
| | 𝛽₁ Wednesday | 4.96 | 0.64 | (3.76; 6.18) |
| | 𝛽₂ Thursday | 2.70 | 0.61 | (1.48; 3.89) |
| | 𝛽₃ Friday | 1.91 | 0.64 | (0.71; 3.15) |
| | 𝛽₄ Saturday | 3.50 | 0.63 | (2.28; 4.73) |
| | 𝜎 | 51.39 | 0.12 | (51.17; 51.62) |
| Table 3. Posterior predictive results from k-fold cross validation. | ||||
|---|---|---|---|---|
| | MAE | MAE Scaled | Within 50% | Within 95% |
| Model 1: Continuous Predictor | ||||
| Flat Priors | 15.730 | 0.313 | 0.841 | 0.966 |
| Default Tuned Priors | 15.779 | 0.314 | 0.840 | 0.966 |
| Tuned Priors | 15.668 | 0.312 | 0.849 | 0.966 |
| Model 2: Categorical Predictor | ||||
| Flat Priors | 17.110 | 0.338 | 0.866 | 0.965 |
| Default Tuned Priors | 17.080 | 0.338 | 0.866 | 0.966 |
| Tuned Priors | 17.118 | 0.339 | 0.867 | 0.966 |
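The metrics in Table 3 can be computed from posterior predictive draws for held-out observations. The sketch below is a hedged illustration, not the code behind the table. It assumes that MAE is the median absolute error between each observed value and its posterior predictive median, that the scaled version divides each error by the predictive standard deviation, and that "Within 50%" and "Within 95%" are the shares of observations falling inside the central predictive intervals. The source does not state these definitions, and the data here are simulated.

```python
import random
import statistics

def quantile(ordered, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

def prediction_summary(observed, predictive_draws):
    """Summarize predictive accuracy; predictive_draws[i] holds draws for observed[i]."""
    abs_errors, scaled_errors = [], []
    in_50 = in_95 = 0
    for y, draws in zip(observed, predictive_draws):
        ordered = sorted(draws)
        error = abs(y - statistics.median(ordered))
        abs_errors.append(error)
        scaled_errors.append(error / statistics.stdev(ordered))
        in_50 += quantile(ordered, 0.25) <= y <= quantile(ordered, 0.75)
        in_95 += quantile(ordered, 0.025) <= y <= quantile(ordered, 0.975)
    n = len(observed)
    return {
        "mae": statistics.median(abs_errors),
        "mae_scaled": statistics.median(scaled_errors),
        "within_50": in_50 / n,
        "within_95": in_95 / n,
    }

# Simulated check: observations and predictions share the same N(0, 1) distribution
rng = random.Random(7)
observed = [rng.gauss(0, 1) for _ in range(500)]
predictive_draws = [[rng.gauss(0, 1) for _ in range(400)] for _ in observed]
summary = prediction_summary(observed, predictive_draws)
```

In k-fold cross validation the same summary would be computed on each held-out fold, using a model fitted to the remaining folds, and then averaged across folds.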
| Table 4. Effective sample size ratios for Model 1. | |||
|---|---|---|---|
| Priors | 𝛽₀ Intercept | 𝛽₁ Departure Time | 𝜎 |
| Flat | 0.83 | 1.19 | 0.48 |
| Default | 0.80 | 1.10 | 0.80 |
| Tuned | 0.64 | 0.71 | 1.04 |
| Table 5. Effective sample size ratios for Model 2. | ||||||||
|---|---|---|---|---|---|---|---|---|
| Priors | 𝛽₀ Intercept (Tuesday) | 𝛽₅ Sunday | 𝛽₆ Monday | 𝛽₁ Wednesday | 𝛽₂ Thursday | 𝛽₃ Friday | 𝛽₄ Saturday | 𝜎 |
| Flat | 0.27 | 0.35 | 0.36 | 0.36 | 0.38 | 0.36 | 0.37 | 2.92 |
| Default | 0.37 | 0.59 | 0.55 | 0.60 | 0.61 | 0.56 | 0.58 | 0.66 |
| Tuned | 0.47 | 0.97 | 0.93 | 0.97 | 0.89 | 1.00 | 0.95 | 0.21 |
| Table 6. R-hat metric for Model 1. | |||
|---|---|---|---|
| Priors | 𝛽₀ Intercept | 𝛽₁ Departure Time | 𝜎 |
| Flat | 0.9997 | 0.9996 | 1.0015 |
| Default | 1.0008 | 1.0005 | 1.0004 |
| Tuned | 1.0004 | 0.9996 | 1.0004 |
| Table 7. R-hat metric for Model 2. | ||||||||
|---|---|---|---|---|---|---|---|---|
| Priors | 𝛽₀ Intercept (Tuesday) | 𝛽₅ Sunday | 𝛽₆ Monday | 𝛽₁ Wednesday | 𝛽₂ Thursday | 𝛽₃ Friday | 𝛽₄ Saturday | 𝜎 |
| Flat | 1.0045 | 1.0036 | 1.0024 | 1.0027 | 1.0027 | 1.0028 | 1.0012 | 0.9994 |
| Default | 1.0015 | 0.9995 | 1.0002 | 1.0001 | 1.0015 | 0.9996 | 1.0010 | 0.9995 |
| Tuned | 1.0003 | 1.0008 | 0.9996 | 0.9995 | 1.0002 | 0.9997 | 0.9998 | 1.0056 |
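The R-hat values in Tables 6 and 7 compare variability between chains with variability within chains. The sketch below implements the basic Gelman-Rubin form of the statistic on simulated chains. Stan-based software reports a refined split, rank-normalized variant, so this is an illustration of the idea rather than a reproduction of those numbers.

```python
import random
import statistics

def r_hat(chains):
    """Basic Gelman-Rubin R-hat for equal-length chains of one parameter."""
    n = len(chains[0])
    chain_means = [statistics.fmean(chain) for chain in chains]
    # W: average within-chain variance
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    # B: between-chain variance, scaled by the chain length
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5

rng = random.Random(11)
# Four well-mixed chains sampling the same distribution
mixed = [[rng.gauss(0, 1) for _ in range(1000)] for _ in range(4)]
# Same setup, but one chain is stuck around a different value
stuck = mixed[:3] + [[rng.gauss(3, 1) for _ in range(1000)]]

r_hat_mixed = r_hat(mixed)
r_hat_stuck = r_hat(stuck)
```

When all chains explore the same distribution the statistic sits very close to 1 and can dip slightly below it, as several table entries do. A chain stuck elsewhere inflates the between-chain term and pushes the value well above 1.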
| Table 8. Model 1 Comparison: Bayesian and OLS | | | |
|---|---|---|---|
| | | Estimate | SE |
| Model 1: Continuous Predictor | | | |
| Default Tuned Priors | 𝛽₀ Intercept | -10.94 | 0.47 |
| | 𝛽₁ Departure Time | 0.02 | 0.00 |
| | 𝜎 | 51.09 | 0.12 |
| OLS Model: Continuous Predictor | | | |
| | 𝛽₀ Intercept | -10.92 | 0.47 |
| | 𝛽₁ Departure Time | 0.02 | 0.00 |
| | Residual Standard Error | 51.09 | |
| Table 9. Model 2 Comparison: OLS and Bayesian | | | |
|---|---|---|---|
| | | Estimate | SE |
| Model 2: Categorical Predictor | | | |
| Default Tuned Priors | 𝛽₀ Intercept (Tuesday) | 1.54 | 0.40 |
| | 𝛽₅ Sunday | 3.48 | 0.58 |
| | 𝛽₆ Monday | 1.89 | 0.60 |
| | 𝛽₁ Wednesday | 4.40 | 0.60 |
| | 𝛽₂ Thursday | 2.69 | 0.60 |
| | 𝛽₃ Friday | 2.97 | 0.64 |
| | 𝛽₄ Saturday | 4.96 | 0.59 |
| | 𝜎 | 51.40 | 0.12 |
| OLS Model: Categorical Predictor | | | |
| | 𝛽₀ Intercept (Tuesday) | 1.55 | 0.44 |
| | 𝛽₅ Sunday | 4.95 | 0.62 |
| | 𝛽₆ Monday | 2.68 | 0.62 |
| | 𝛽₁ Wednesday | 1.88 | 0.62 |
| | 𝛽₂ Thursday | 3.47 | 0.62 |
| | 𝛽₃ Friday | 4.40 | 0.61 |
| | 𝛽₄ Saturday | 2.96 | 0.64 |
| | Residual Standard Error | 51.39 | |
| Variable | Description |
|---|---|
| year | The year of the data. |
| month | The month of the data. |
| carrier | Carrier code. |
| carrier_name | Carrier name. |
| airport | Airport code. |
| airport_name | Airport name. |
| arr_flights | Number of arriving flights. |
| arr_del15 | Flights delayed by 15+ minutes. |
| carrier_ct | Carrier-caused delays. |
| weather_ct | Weather-caused delays. |
| nas_ct | NAS-related delays. |
| security_ct | Security-caused delays. |
| late_aircraft_ct | Delays from late aircraft. |
| arr_cancelled | Number of canceled flights. |
| arr_diverted | Number of diverted flights. |
| arr_delay | Total arrival delay. |
| carrier_delay | Delay attributed to the carrier. |
| weather_delay | Delay attributed to weather. |
| nas_delay | Delay attributed to the NAS. |
| security_delay | Delay attributed to security. |
| late_aircraft_delay | Delay from late-arriving aircraft. |
| Characteristic | Value |
|---|---|
| Total Months of Data (August) | 1 |
| Total Carriers | 21 |
| Total Arrived Flights (Count Data) | 62,146,805 |
| Total Delayed Flights (15+ min) | 11,375,095 |
| - Carrier Delays (31.34%) | 3,565,080.59 |
| - Weather Delays (3.39%) | 385,767.94 |
| - NAS Delays (29.21%) | 3,322,432.52 |
| - Security Delays (0.24%) | 26,930.39 |
| - Late Aircraft Delays (35.82%) | 4,074,891.00 |
| Total Cancelled Flights | 1,290,923 |
| Total Diverted Flights | 148,007 |
| Cancelled Flights (%) | 2.08 |
| Diverted Flights (%) | 0.24 |
| Parameter | Estimate | Standard Error | 95% Credible Interval |
|---|---|---|---|
| Intercept | -2116.53 | 7.67 | [-2131.41, -2100.91] |
| Weather Count | 1041.97 | 2.66 | [1036.73, 1047.15] |
| Sigma | 8676.19 | 15.52 | [8646.95, 8706.92] |
| Statistic | Value |
|---|---|
| Number of Observations | 171,426 |
| Model Family | Gaussian |
| Formula | arr_delay ~ weather_ct |
| Iterations | 2000 |
| Warmup | 1000 |
| Chains | 4 |
| Effective Sample Size (Bulk) [Intercept, Weather Count] | [2102.722, 2000.139] |
| Effective Sample Size (Tail) [Intercept, Weather Count] | [2095.692, 1858.849] |
| Posterior Mean of Weather Count Coefficient (minutes per incident) | 1041.966 |
| Posterior Median of Weather Count Coefficient | 1041.971 |
| Posterior Standard Deviation of Weather Count Coefficient | 2.660956 |
| 95% Credible Interval for Weather Count Coefficient | [1036.731, 1047.15] |
Y = -11.66 + 0.02X.
Y = 1.54 + 4.96X_1 + 2.70X_2 + 1.91X_3 + 3.50X_4 + 4.41X_5 + 2.99X_6.
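The two fitted equations can be evaluated directly. The sketch below uses the tuned-prior posterior means from Table 2 and maps X_1 through X_6 to Wednesday through Monday as labeled there, with Tuesday as the reference level. It assumes departure time is measured in minutes after midnight, which the source does not state explicitly; the function names are hypothetical.

```python
MODEL1_INTERCEPT = -11.66
MODEL1_SLOPE = 0.02

MODEL2_INTERCEPT = 1.54  # mean arrival delay on Tuesday, the reference day
MODEL2_OFFSETS = {
    "Tuesday": 0.0,
    "Wednesday": 4.96,  # beta_1
    "Thursday": 2.70,   # beta_2
    "Friday": 1.91,     # beta_3
    "Saturday": 3.50,   # beta_4
    "Sunday": 4.41,     # beta_5
    "Monday": 2.99,     # beta_6
}

def predict_model1(departure_minutes):
    """Expected arrival delay given departure time (assumed minutes after midnight)."""
    return MODEL1_INTERCEPT + MODEL1_SLOPE * departure_minutes

def predict_model2(day):
    """Expected arrival delay for a day of the week."""
    return MODEL2_INTERCEPT + MODEL2_OFFSETS[day]

delay_at_1330 = predict_model1(13 * 60 + 30)   # -11.66 + 0.02 * 810, about 4.54
delay_wednesday = predict_model2("Wednesday")  # 1.54 + 4.96, about 6.50
```

These are expected values only. With \sigma near 51 minutes in both models, individual flights vary far more than the differences between times of day or days of the week.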
K-Fold Cross Validation
Effective Sample Size
\hat{R} \approx 1
MCMC Diagnostics
Autocorrelation
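Autocorrelation measures how strongly each draw in a chain depends on earlier draws, and it determines how much independent information the chain carries. The sketch below estimates lag autocorrelations and a simple effective sample size ratio on simulated chains. The truncation rule (stop at the first non-positive autocorrelation) is a simplification chosen for this example and differs from the estimator used by Stan-based software.

```python
import random

def autocorrelation(chain, lag):
    """Sample autocorrelation of a chain at the given lag."""
    n = len(chain)
    mean = sum(chain) / n
    variance_sum = sum((v - mean) ** 2 for v in chain)
    covariance_sum = sum(
        (chain[i] - mean) * (chain[i + lag] - mean) for i in range(n - lag)
    )
    return covariance_sum / variance_sum

def ess_ratio(chain, max_lag=100):
    """Effective sample size divided by the number of draws (simple truncation)."""
    total = 0.0
    for lag in range(1, max_lag + 1):
        rho = autocorrelation(chain, lag)
        if rho <= 0:
            break  # stop once the estimated correlation is no longer positive
        total += rho
    return 1.0 / (1.0 + 2.0 * total)

rng = random.Random(3)
# A slowly mixing chain: each draw keeps 90% of the previous value
sticky_chain = [0.0]
for _ in range(4999):
    sticky_chain.append(0.9 * sticky_chain[-1] + rng.gauss(0, 1))
# A chain of independent draws for comparison
independent_chain = [rng.gauss(0, 1) for _ in range(5000)]

lag1_sticky = autocorrelation(sticky_chain, 1)
sticky_ratio = ess_ratio(sticky_chain)
independent_ratio = ess_ratio(independent_chain)
```

This simplified estimator cannot exceed 1, whereas the ratios reported by Stan-based software can, because negatively correlated draws carry more information than independent ones.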
Intercept: -2116.53 (95% CI: [-2131.41, -2100.91])
Weather Count Coefficient: 1041.97 (95% CI: [1036.73, 1047.15])
Each additional weather-related delay (weather_ct) is associated with an average increase of about 1,042 minutes in total arrival delay (arr_delay).
Weather incidents are infrequent but highly disruptive.
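The posterior means above define a fitted line that can be evaluated for any weather count. The sketch below does this with the reported point estimates only, so it ignores posterior uncertainty. The function name and example inputs are hypothetical.

```python
INTERCEPT = -2116.53  # posterior mean of the intercept
SLOPE = 1041.97       # posterior mean of the weather count coefficient

def expected_total_delay(weather_ct):
    """Expected total arrival delay (minutes) for a given weather delay count."""
    return INTERCEPT + SLOPE * weather_ct

delay_at_zero = expected_total_delay(0)    # -2116.53
delay_at_ten = expected_total_delay(10)    # -2116.53 + 10419.70, about 8303.17
per_incident = expected_total_delay(11) - expected_total_delay(10)  # about 1041.97
```

The negative value at a weather count of zero is worth noting: if total arrival delay cannot be negative, the straight-line form fits poorly near zero, which bears on the question of which model assumptions to revisit.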
Uncertainty Measures:
Residual variability: Standard deviation = 8676.19.
Suggests other unmeasured factors affecting delays.
Model Diagnostics:
Rhat = 1.00 for all parameters, indicating convergence.
Large effective sample sizes (about 2,000 for each coefficient) support reliable posterior estimates.
Key Insight:
Weather-related incidents, though infrequent, have a disproportionately large impact on delay times.
Highlights the need for better weather management and forecasting.
Bayesian Approach:
Accounts for uncertainty, providing credible intervals for estimates.
Supports informed decision-making in airline operations and policy-making.
What other factors could be included in the model?
How could expanding the dataset improve insights?
What advanced Bayesian methods could be explored?
How should outliers be addressed?
What assumptions should be revisited?
\beta_{0c} reflects the typical arrival delay at a typical departure time. With a mean departure time at \sim 1:30pm, the average arrival delay is \sim 2 minutes with a standard deviation \sim 36 minutes.
\beta_{0c} \sim N(2, 36^2)
The slope of the linear model indicates a 0.019-minute increase in arrival delay per one-minute increase in departure time, so we set m_1 = 0.02. Its standard error, 0.0005, reflects high confidence, but so as not to over-constrain the model we use a wider prior standard deviation, s_1 = 0.01.
\beta_{1} \sim N(0.02, 0.01^2)
To tune the exponential prior, we set the expected value of the standard deviation, E(\sigma), equal to the residual standard error, \sim 50. With this, we can find the rate parameter, l.
\begin{align*} E(\sigma) &= \frac{1}{l} = 50\\\\ l &= \frac{1}{50} = 0.02\\\\ \sigma &\sim \text{Exp}(0.02) \end{align*}
\begin{align*} Y_i|\beta_0, \beta_1, \sigma &\overset{\text{ind}}{\sim} N (\mu_i, \sigma^2) && \text{with } && \mu_i = \beta_0 + \beta_1X_i \\ \beta_{0} &\sim N(2, 36^2)\\ \beta_1 &\sim N(0.02, 0.01^2)\\ \sigma &\sim \text{Exp}(0.02) \end{align*}
For arrival delays by day of the week, Figure 9 shows that mean arrival delays fall between 1 and 7 minutes while the median arrival delays are all negative, indicating a right-skewed distribution with a tail of large delays.
\beta_{0} reflects the mean arrival delay on Tuesday, our reference. The average arrival delay is \sim 2 minutes with a standard deviation \sim 46 minutes.
\beta_{0} \sim N(2, 46^2)
For a categorical predictor with the stan_glm() function, the tuned prior, \beta_j, is applied to the estimation of each coefficient associated with the individual levels of the predictor (\beta_1, \beta_2, \dots, \beta_6). For this reason, we set the coefficient prior to be weakly informative.
\beta_{j} \sim N(0, 50^2)
To tune the exponential prior, we set the expected value of the standard deviation, E(\sigma), equal to the residual standard error, which is the same as in the previous model, \sim 50.
\begin{align*} E(\sigma) &= \frac{1}{l} = 50\\\\ l &= \frac{1}{50} = 0.02\\\\ \sigma &\sim \text{Exp}(0.02) \end{align*}
The tuned model is as follows:
\begin{align*} Y_i|\beta_0, \beta_1, \dots, \beta_6, \sigma &\overset{\text{ind}}{\sim} N (\mu_i, \sigma^2) && \text{with } && \mu_i = \beta_0 + \beta_1X_{i1} + \beta_2X_{i2} + \dots + \beta_6X_{i6} \\ \beta_{0} &\sim N(2, 46^2)\\ \beta_j &\sim N(0, 50^2), \quad j = 1, \dots, 6\\ \sigma &\sim \text{Exp}(0.02) \end{align*}
Priors for model 'default_model_dt'
------
Intercept (after predictors centered)
Specified prior:
~ normal(location = 4.5, scale = 2.5)
Adjusted prior:
~ normal(location = 4.5, scale = 129)
Coefficients
Specified prior:
~ normal(location = 0, scale = 2.5)
Adjusted prior:
~ normal(location = 0, scale = 0.43)
Auxiliary (sigma)
Specified prior:
~ exponential(rate = 1)
Adjusted prior:
~ exponential(rate = 0.019)
------
See help('prior_summary.stanreg') for more details
Priors for model 'flat_model_dt'
------
Intercept (after predictors centered)
~ flat
Coefficients
~ flat
Auxiliary (sigma)
~ flat
------
See help('prior_summary.stanreg') for more details
Priors for model 'default_model_dow'
------
Intercept (after predictors centered)
Specified prior:
~ normal(location = 4.5, scale = 2.5)
Adjusted prior:
~ normal(location = 4.5, scale = 129)
Coefficients
Specified prior:
~ normal(location = [0,0,0,...], scale = [2.5,2.5,2.5,...])
Adjusted prior:
~ normal(location = [0,0,0,...], scale = [364.46,361.83,370.01,...])
Auxiliary (sigma)
Specified prior:
~ exponential(rate = 1)
Adjusted prior:
~ exponential(rate = 0.019)
------
See help('prior_summary.stanreg') for more details